Despite the fundamental contribution of the gut microbiota to host physiology, the extent of its variation in
genetically-identical animals used in research is not known. We report significant divergence in both the 
composition and metabolism of gut microbiota in genetically-identical adult C57BL/6 mice housed in 
separate controlled units within a single commercial production facility. The reported divergence in gut 
microbiota has the potential to confound experimental studies using mammalian models. 

Researchers using animal models are becoming increasingly aware of possible influences of the gut microbiota
on physiology. Murine models have been used to demonstrate relationships between the gut microbiota
and obesity, metabolic disease, cardiovascular health, nervous system development, diabetes,
immune function, hepatic function, inflammatory bowel conditions, and carcinogenesis, highlighting the
potential impact that differences in the microbiome of mice from different animal facilities could have on
research. However, most researchers assume that genetically-identical mice derived from a single supplier will
have an equivalent microbiome. To test this assumption, we studied the faecal microbiome and metabolome of
genetically-identical C57BL/6 mice housed in four separate controlled units within a single facility of a commercial
supplier of animals for research. Faecal samples were collected at eight weeks of age from twenty mice, with
five mice sampled in each of four barrier rooms. These mice were separated by no more than ten generations.

Methods 

Murine faecal samples. Faeces were collected from eight-week-old C57BL/6 mice at the Charles River commercial facility (Margate, UK) under
commercial licence, with all mice kept in accordance with protocols approved by The Animal Health and Welfare Board for England.
Samples were collected from 20 mice, housed in four separate barrier rooms within the facility and fed the same chow (a VRF1 diet, SDS). The
five mice sampled in each room were taken from separate cages, i.e. no two mice came from the same cage. Mice in this study were handled
on a weekly basis, for cage cleaning purposes, by individuals wearing gloves. Mice were not housed exclusively with litter mates, with 27
individuals housed per room. Samples consisted of individual faecal pellets
taken from individual mice. After collection, pellets were placed into separate collection tubes and frozen prior to analysis.

Microbiota. Nucleic acid extractions were carried out using a combination of physical disruption and phenol/chloroform extraction
methods, described previously. 16S rRNA gene universal Bacterial primers 27F-519R (27F 5'-AGRGTTTGATCMTGGCTCAG, 519R
5'-GTNTTACNGCGGCKGCTG) were used in a single-step 30 cycle PCR using HotStarTaq Plus Master Mix Kit (Qiagen, Valencia, CA)
performed under the following conditions: 94 °C for 5 minutes, followed by 28 cycles of 94 °C for 30 seconds, 53 °C for 40 seconds, and 72 °C
for 1 minute. Amplification was followed by a final elongation step at 72 °C for 5 minutes. Following PCR, all amplicon products from
different samples were mixed in equal concentrations and purified using Agencourt Ampure beads (Agencourt Bioscience Corporation, MA,
USA). Samples were sequenced utilizing Roche 454 FLX titanium instruments and reagents following manufacturer's guidelines. A total of
165,934 16S rRNA gene sequences were obtained from the 20 faecal sample extracts. Following curation, an average of 4,356 sequences was
obtained for each of the samples. For analysis of alpha and beta diversity, samples were normalised to 2,179 sequences per sample.
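
The normalisation to a common depth can be illustrated with a minimal sketch, assuming it corresponds to rarefaction, i.e. subsampling each sample's reads without replacement down to 2,179. The function name `rarefy`, the toy OTU counts and the fixed random seed are illustrative assumptions and are not taken from the pipeline used in the study.

```python
import random
from collections import Counter

def rarefy(otu_counts, depth, seed=0):
    """Subsample reads without replacement to a fixed depth.

    otu_counts: dict mapping OTU label -> read count (hypothetical data).
    Returns a dict of subsampled counts, or None if the sample has
    fewer reads than the requested depth.
    """
    total = sum(otu_counts.values())
    if total < depth:
        return None
    # Expand the counts into one label per read, then draw `depth` reads.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(reads, depth)))

# Toy sample with 4,356 reads, rarefied to the depth quoted in the text.
sample = {"OTU_1": 2500, "OTU_2": 1500, "OTU_3": 356}
rarefied = rarefy(sample, depth=2179)
print(sum(rarefied.values()))
```

Because the draw is random, rare OTUs may disappear from a rarefied sample, which is why richness estimates are compared only after all samples have been brought to the same depth.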

For sequence data analysis, the Q25 sequence data derived from the sequencing process were processed using a standard
analysis pipeline (MR DNA, Shallowater, USA). Sequences were depleted of barcodes and primers; short sequences (<200 bp),
sequences with ambiguous base calls, and sequences with homopolymer runs exceeding 6 bp were then removed. The remaining sequences were
denoised and chimeras removed. Operational taxonomic units (OTUs) were defined after removal of singleton sequences, clustering at 3%
divergence (97% similarity). Final OTUs were taxonomically classified using BLASTn against a curated database derived from GreenGenes,
NCBI and RDP databases. Normalized and de-noised files were then rarefied and run through QIIME to generate alpha and beta diversity
data. Additional statistical analyses were performed with NCSS2007 (NCSS, UT) and XLstat 2012 (Addinsoft, NY).
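
The three read-level filters described above (minimum length, no ambiguous base calls, no long homopolymer runs) can be sketched as follows. This is an illustration under stated assumptions, not the MR DNA pipeline: the function name `passes_filters` and the toy reads are hypothetical, reads are assumed to have had barcodes and primers removed already, and denoising and chimera removal are not covered.

```python
import re

MIN_LENGTH = 200      # reads shorter than this are discarded
MAX_HOMOPOLYMER = 6   # runs longer than this are discarded

def passes_filters(seq):
    """Return True if a trimmed read survives the three stated filters."""
    seq = seq.upper()
    if len(seq) < MIN_LENGTH:
        return False
    # Any character other than A, C, G or T counts as an ambiguous call.
    if re.search(r"[^ACGT]", seq):
        return False
    # One base followed by at least MAX_HOMOPOLYMER copies of itself,
    # i.e. a homopolymer run exceeding MAX_HOMOPOLYMER bases.
    if re.search(r"(.)\1{%d,}" % MAX_HOMOPOLYMER, seq):
        return False
    return True

# Hypothetical reads illustrating each outcome.
reads = {
    "good": "ACGT" * 60,
    "short": "ACGT" * 10,
    "ambiguous": "ACGT" * 59 + "ACNT",
    "homopolymer": "ACGT" * 58 + "AAAAAAA" + "C",
}
kept = [name for name, seq in reads.items() if passes_filters(seq)]
print(kept)
```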

A range of diversity and richness measures were used to assess changes in microbiota
composition, including taxa richness, Chao1, Shannon index and Simpson index
(1 − D). Analysis of microbiota diversity was performed using PAST (Palaeontological
Statistics), version 3.01, a program available from the University of Oslo website
(http://folk.uio.no/ohammer/past).
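
For readers unfamiliar with these indices, the sketch below computes them from a vector of OTU counts for one sample. It assumes the commonly used definitions (bias-corrected Chao1, Shannon index with natural logarithms, Simpson index reported as 1 − D); whether these match the exact variants applied in the study is an assumption, and the toy counts are hypothetical.

```python
import math

def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts if c > 0)

def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singleton and doubleton taxa."""
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return richness(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))

def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i)."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def simpson_1_minus_d(counts):
    """Simpson index expressed as 1 - D, with D = sum(p_i ** 2)."""
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts if c > 0)

toy_counts = [5, 3, 1, 1, 2]  # hypothetical OTU counts for one sample
print(richness(toy_counts), chao1(toy_counts))
print(shannon(toy_counts), simpson_1_minus_d(toy_counts))
```

Higher values of each index indicate a richer or more even community, so a room group with consistently lower values has a less diverse faecal microbiota.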

NMR metabolomics. Portions of mouse faeces of approximately 0.02 g were 
resuspended by vortexing in 500 µl of phosphate buffered saline. Particulate matter
was pelleted by centrifugation at 13,000 × g for 10 min, and supernatant transferred
to a fresh microfuge tube. Centrifugation was repeated, with pelleted material again
discarded. Supernatant was frozen by immersion in liquid nitrogen, lyophilised at
−58 °C overnight, and re-suspended in 500 µl D2O. NMR spectra of three
replicates were acquired at 400 MHz on a Bruker Avance spectrometer (Bruker,
Coventry, UK) equipped with a 5 mm QNP probe using a zgesgp pulse sequence
incorporating water suppression via excitation sculpting with gradients. The 90
degree pulse was 9.75 µs. The spectral width was 20 ppm. Free induction decays were
multiplied with an exponential function corresponding to a line broadening of
0.3 Hz. The spectra were Fourier transformed and calibrated to a
2,2,3,3-D4-3-(trimethylsilyl)propionic acid (TSP) reference signal at 0 ppm. Phase correction was
performed manually and automatic baseline correction was applied. To help in the 
assignment of the metabolite resonances, J-resolved 2D correlation was performed 
with pre-saturation during relaxation delay using gradients (J-Res, Bruker).
Pre-processing and orthogonal projection to latent structures discriminant analysis
(OPLS-DA) were carried out with software that was developed in our laboratory for a
previous study using the Python programming language with NumPy and SciPy for
calculations, and matplotlib for visualization. The nonlinear iterative partial
least-squares (NIPALS) algorithm was used for OPLS-DA analysis. Regions above
8.5 ppm and below 0.45 ppm were excluded because of noise content. The water peak
and TSP reference signal were also excluded. Spectra were bucketed using a 0.005 ppm
bin size, leaving 1588 data points per spectrum. These spectra were normalized
and auto-scaled (variance of every data point normalized to 1). Cross-validation was 
performed where 75% of the samples were used as a training set and the remaining 
25% as a test set, ensuring that the number of samples in the test set was proportional 
to the total number of samples from each class, and that at least one sample from each 
class was present in the test set. To choose the number of components for the model, a 
leave-one-out cross-validation was carried out on the samples in the training set, and 
the F1 score used to choose the number of components, with the additional constraint to
use a maximum of 8 components. A double cross-validation was repeated 2000 times
with randomly chosen samples in the training and test set to prevent bias due to the
choice of training or test set. This led to 4 × 2000 models. Finally, this procedure was
repeated with randomly generated class assignments to provide a reference value for
Q². The chosen number of components minus one was then used as an OPLS filter,
and a PLS-DA analysis with two components was carried out on the filtered data to
yield one predictive and one orthogonal component. In the back-scaled loadings
analysis, peaks that allow the models to distinguish between classes were assigned by
comparing chemical shift values and multiplicities from J-resolved NMR spectra to
values from the BMRB and HMDB.
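
Three of the steps described above lend themselves to a short illustration: auto-scaling of each data point, the class-proportional 75%/25% split, and the Q² statistic. The sketch below is not the authors' software. It assumes the usual definition Q² = 1 − PRESS/TSS evaluated on held-out samples, uses the sample variance (n − 1) for scaling, and all function names, class labels and class sizes are hypothetical. The OPLS-DA model itself is not implemented here.

```python
import random

def autoscale(column):
    """Mean-centre one variable and scale it to unit variance."""
    n = len(column)
    mean = sum(column) / n
    var = sum((x - mean) ** 2 for x in column) / (n - 1)
    return [(x - mean) / var ** 0.5 for x in column]

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), sampling each class proportionally
    and keeping at least one sample of every class in the test set."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        n_test = max(1, round(test_fraction * len(idx)))
        test += idx[:n_test]
        train += idx[n_test:]
    return sorted(train), sorted(test)

def q2(y_true, y_pred):
    """Q2 = 1 - PRESS / TSS for predictions on held-out samples."""
    mean = sum(y_true) / len(y_true)
    press = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    tss = sum((t - mean) ** 2 for t in y_true)
    return 1 - press / tss

labels = ["I"] * 8 + ["II"] * 5          # hypothetical class sizes
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
print(q2([1, 1, -1, -1], [0.9, 0.8, -0.7, -1.1]))
```

Under this definition a model that predicts no better than the class mean gives Q² near zero, and predictions worse than the mean give negative values, which is why models built on randomly generated class assignments provide a useful reference.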

Results & Discussion 

Analysis of the bacterial identities derived from 16S ribosomal RNA
gene sequencing revealed the faecal microbiota to be dominated by
the phyla Bacteroidetes and Firmicutes (62.4 ± 22.4% (SD) and 34.7
± 23.9%, respectively), although marked variation was observed in
phylum relative abundance between individual animals (Fig. S1).
Further, microbiota alpha diversity, as assessed by rarefaction,
Chao1 richness estimate, OTU richness and Shannon index, was
significantly lower for mice of one room group (room 4) compared
with mice from other room groups (Table S1) (Kruskal-Wallis
controlled multiple pair-wise comparison, p < 0.001).
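
As an illustration of the rank-based comparison between room groups, the sketch below computes the Kruskal-Wallis H statistic with the standard tie correction. It covers only the omnibus statistic, not the controlled pair-wise comparisons reported above, and the diversity values are invented for the example. Under the null hypothesis H is approximately chi-square distributed with k − 1 degrees of freedom for k groups.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction.

    groups: list of lists of observations, one inner list per group.
    """
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of = {}   # value -> average rank (ties share the mean rank)
    tie_term = 0   # sum of (t**3 - t) over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))

# Hypothetical Shannon index values for five mice in each of four rooms.
rooms = [
    [3.1, 3.4, 3.0, 3.3, 3.2],
    [3.5, 3.6, 3.8, 3.7, 3.9],
    [4.0, 4.2, 4.1, 4.4, 4.3],
    [1.9, 2.1, 2.0, 2.3, 2.2],
]
print(kruskal_wallis_h(rooms))
```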

Analysis of microbiota at the genus level identified the twenty
genera with the highest mean relative abundance (Table S2), which
were broadly in keeping with those reported in the murine gut
previously. The most commonly numerically dominant genus was
Prevotella (39.0 ± 20.2% of sequences), a genus associated with a
long-term carbohydrate-rich diet in humans. Again, however,
significant differences in the microbiota were identified between room
groups (controlled ANOVA tests, p < 0.05) (Table S3). Coprococcus,
Ruminococcus, and Anaerotruncus were significantly higher in room 
1 samples, Pedobacter was significantly higher in room 2 samples, 
Novispirillum was significantly higher in room 3 samples and 
Prevotella was significantly higher in room 4 samples. Samples from 
rooms 2 and 3 groups had significantly higher abundance of 
Parabacteroides and Sphingobacterium than samples from rooms 1 
and 4. 



Hierarchical cluster analysis based upon the predominant genera
indicates divergence in the composition of the microbiota into three
clusters (Fig. 1). Cluster I comprised samples from all animals from
room 3 and additional animals from rooms 1 and 2, cluster II
comprised all animals from room 4 and cluster III included all of the
remaining animals from rooms 1 and 2. Notable is the absence, or
very low abundance, in room group 4 of a number of genera including
Sutterella, Sphingobacterium, Novispirillum and Porphyromonas.
Overall, therefore, the bacterial microbiota showed marked divergence
that was in some cases linked to room occupancy, with these
compositional differences resolving into three clusters.

Whilst all mice received the same standard diet, the differences in
the composition of their microbiota indicated a potential for distinct
metabolomic characteristics. Carbohydrates, the major constituent of
mouse chow, are fermented in the colon to short chain fatty acids
(SCFAs), primarily acetate, butyrate, lactate and propionate.
Whilst SCFAs are just one class of compounds, they are important
in shaping the microbial community and preventing the growth of
pathogens. Moreover, SCFA levels impact on the host and are
known to be important in relation to nutrition, adipose tissue
deposition, immunity and cancer, amongst other conditions. Different
SCFAs have been associated with effects on specific physiological
processes, with the type of SCFAs varying between bacterial
genera. To test for a functionally-distinct signal, we performed a
metabolomic analysis of the faecal material.

¹H NMR spectroscopy was performed on buffered saline extracts
from the same faecal samples used for microbiota sequencing. We
hypothesised that there would be differences when comparing the
metabolome of faeces from mice whose faecal microbiota were
distinct. Analysis involved a series of pairwise orthogonal partial least
squares discriminant analysis (OPLS-DA) tests using classes
suggested by clustering according to microbiota (Fig. 2), room
occupancy or dominant phyla. Scores plots for each of three pairwise
comparisons show that there are substantial differences in the
metabolomes extracted from faeces of mice assigned to each cluster (Fig. 2,
left panels). The Q² values obtained for each test performed were compared
with a reference value for Q², obtained after repeating
cross-validation with randomly generated class assignments (Table 1). As
shown, Q² scores for the pairwise analyses of the metabolomic data
separated according to these clusters were >0.50, which
is an accepted threshold for a "good" model. As such, we
observed clear metabolomic differences in the murine faecal samples
based on clusters as defined by the composition of the bacteria
present. Further, microbiota data were used to assess the relative
contribution of Bacteroidetes, Firmicutes and Proteobacteria to each
of the samples tested. Here, Q² scores were all ≥0.41. Significant
differences were also identified in the metabolome of faeces from
mice housed in different room groups, with Q² scores all ≥0.67
(the lowest being room 2 vs. room 3).

Next, we identified the key drivers of the differences in the
metabolomic data by generating back-scaled loadings plots and assigning
resonances with high variance and high weight, indicated by greater
intensity and yellow/red color respectively (Fig. 2, right panels).
Notably, Clusters I and II were distinguished by the greater
abundance of a number of amino acids in the faecal metabolomes of mice in
Cluster II, whereas the faecal metabolomes of mice from Cluster III
were distinguished from those in Clusters I and II on the basis of short
chain fatty acids, which were more abundant in Cluster III. At the
outset of this study, we hypothesised that there would be minimal
differences between the gut microbiota as sampled in the context of
genetically identical mice. However, significant differences were
observed in the taxa detected, their relative abundance, and overall
bacterial diversity. This variation in the faecal microbiota was linked,
at least in part, to the barrier room in which the mice were housed.
Assessment of the metabolome associated with these animals showed
that microbiota and metabolome findings were largely consistent.

Figure 1 | Heat map analysis of the predominant genera identified in this study. A hierarchical cluster diagram was constructed using Ward's minimum
variance clustering and Manhattan distances. Room group 1 and room group 2 exhibit some co-clustering, indicating differences within the groups. The
heatmap describes the relative percentage in each sample of the associated genera, with a legend provided in the upper left of the figure.



Murine models are used in biomedical research to address almost
every aspect of human health. To avoid potentially confounding
differences in genetic backgrounds, mice are taken from inbred
populations, with the rationale being that the resulting homogeneity
provides a uniform "platform" for study. By far the most common
genetic background for mice used as models of human disease is the
strain C57BL/6, as used here. When purchased for research,
individual C57BL/6 mice are commonly considered to be equivalent.
Increasingly, however, the potential of the gastrointestinal microbiota
to influence the host in relation to health and a wide range of clinical
syndromes is being recognised. In this light, the differences
identified in microbiota here require further consideration. Given the
potential impact of the gut microbiota on so many important
physiological processes, the degree to which it is conserved between
individual animals used in biological research is arguably as important as
their genetic uniformity. Further, variation in gut microbiota
composition is likely to be even higher in less well controlled
experimental facilities, and to be exacerbated when mice are moved
between facilities, experience changes in diet, and are exposed to
animals with different microbiota. The divergence in gut microbiota
composition, as reflected in faecal bacteria, strongly suggests that
efforts must be made to ensure uniformity of intestinal microbiota
in animals used in research.

Figure 2 | OPLS-DA scores plots (left panels) and back-scaled loadings plots (right panels) for comparisons between the murine faecal metabolomes as
clustered according to microbiota data composition. Resonances with high variance and high weight are highlighted in red. The distinguishing
metabolites that could be unambiguously assigned are annotated in each back-scaled loadings plot. Q² values for the cross-validated OPLS-DA
comparisons are provided in Table 1.

Table 1 | Predictive Q² values for all models. Q² values for models
run with permutated class assignments are given in parentheses

Model                                        Q²
Cluster I vs Cluster II                      0.88 (-0.15)
Cluster I vs Cluster III                     0.52 (-0.15)
Cluster II vs Cluster III                    0.81 (-0.18)
Room 1 vs Room 2                             0.93 (-0.14)
Room 1 vs Room 3                             0.90 (-0.15)
Room 1 vs Room 4                             0.85 (-0.15)
Room 2 vs Room 3                             0.67 (-0.09)
Room 2 vs Room 4                             0.80 (-0.12)
Room 3 vs Room 4                             0.86 (-0.15)
High Bacteroidetes vs low Bacteroidetes      0.41 (-0.15)
High Firmicutes vs low Firmicutes            0.41 (-0.17)
High Proteobacteria vs low Proteobacteria    0.66 (-0.18)